Dynamic document clustering based on genetic algorithm 基于遗传算法的动态文本聚类
On document clustering based on the fuzzy c-means algorithm 均值算法文档聚类问题的研究
Research on a document clustering algorithm based on the vector space model 基于向量空间模型的文本检索系统
The combination of document clustering techniques and web search engines has become a hot topic in the document mining area 文本聚类技术和网络搜索引擎服务相结合,已经成为文本挖掘领域的一个热点研究课题。
It uses document clustering to find the multiple interests of each user, which improves the accuracy of the description of the user's interests 利用文档聚类发现用户的多个子兴趣主题,从而提高对用户兴趣偏好描述的准确性。
The thesis proposes a new document clustering method that uses a model called the document index graph to represent Chinese documents 本文提出的一种新的文本聚类方法,采用一种称为文档索引图的结构来构建中文文本表示模型。
However, there is still little research on applying document clustering techniques to Chinese web documents and combining them with Chinese web search engine services 但是,把文本聚类技术应用于中文web文档,与中文搜索引擎服务相结合的研究仍然比较匮乏。
Our experimental evaluations show that our methods surpass NMF not only in the easy and reliable derivation of document clustering results, but also in document clustering accuracy 实验结果显示,在聚类的容易度、准确度、时间复杂度上均取得较nmf算法更合理的效果。
To improve document clustering, a cosine-vector-based document similarity measure is proposed that combines keyword frequency in documents with an input ontology 为了改进文本聚类的效果,提出了将领域知识本体和文本关键词词频相结合的基于余弦向量的文本相似性测度方法。
2. Analyze the drawbacks of traditional transaction identification methods and propose an improved one, which incorporates the content of web pages and applies a document clustering algorithm in the process 2. 在分析传统事务识别方法不足的基础上,结合网页内容对事务识别方法进行适当的改进,将内容挖掘中的文本聚类算法引入到事务识别的过程中。
Document clustering (or text clustering) is the application of cluster analysis to textual documents. It is used in automatic document organization, topic extraction, and fast information retrieval or filtering. It is closely related to data clustering.
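
As a rough illustration of the idea, the sketch below groups a few short texts by the cosine similarity of their term-frequency vectors. It is only a minimal example under stated assumptions: the function names (term_frequencies, cosine_similarity, cluster), the whitespace tokenization, the greedy single-pass grouping, and the 0.3 threshold are all hypothetical choices made for illustration, not a method taken from any of the works quoted above.

```python
import math
from collections import Counter


def term_frequencies(text):
    """Lowercase the text, split on whitespace, and count each term."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(docs, threshold=0.3):
    """Greedy single-pass grouping (an illustrative assumption):
    a document joins the first cluster whose first member is similar
    enough, otherwise it starts a new cluster."""
    vectors = [term_frequencies(d) for d in docs]
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine_similarity(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


docs = [
    "genetic algorithm for document clustering",
    "fuzzy c-means document clustering algorithm",
    "web search engine ranking",
    "chinese web search engine services",
]
print(cluster(docs))  # [[0, 1], [2, 3]]
```

The first two texts share three terms and the last two share three terms, so each pair ends up in its own group. Real systems typically replace the raw counts with weighted vectors and the greedy pass with an algorithm such as k-means.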